Skip to content

[AscendNPU-IR][A5] surpport for cumsum interleave and deinterleave#905

Open
james1314520 wants to merge 2 commits intotile-ai:npuir-devfrom
james1314520:npuir-dev-22
Open

[AscendNPU-IR][A5] surpport for cumsum interleave and deinterleave#905
james1314520 wants to merge 2 commits intotile-ai:npuir-devfrom
james1314520:npuir-dev-22

Conversation

@james1314520
Copy link
Copy Markdown

@james1314520 james1314520 commented Apr 22, 2026

support for cumsum interleave and deinterleave

@github-actions
Copy link
Copy Markdown

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run bash format.sh in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work!

🚀

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request migrates the Cumsum, Interleave, and Deinterleave operations from the hivm dialect to the hfusion dialect in the NPU codegen and adds associated unit tests. However, the implementation of VinterleaveCodegen and VdeinterleaveCodegen contains critical issues: they use memref types where tensor types are expected and incorrectly overwrite destination buffers, leading to potential data loss when dealing with subregions. Additionally, the DeinterleaveOp is missing the necessary channel_nums attribute.

Comment thread src/target/codegen_npuir_dev.cc Outdated
Comment on lines +2586 to +2593
Value dst = GenSubviewFromRegion(npuirop.dst, npuirop.dst_range);
builder.create<mlir::hivm::VInterleaveOp>(
builder.getUnknownLoc(), TypeRange{}, srcs_vr, dst,
static_cast<int64_t>(npuirop.channel_nums));
mlir::Type outputType = dst.getType();

auto interleaveOp = builder.create<mlir::hfusion::InterleaveOp>(
builder.getUnknownLoc(),
outputType,
srcs_vr);
SetVarValue(npuirop.dst, interleaveOp->getResult(0));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current implementation of VinterleaveCodegen has several issues:

  1. Type Mismatch: It uses GenSubviewFromRegion, which returns a memref, but hfusion operations and the SetVarValue mapping system in this codegen are designed for tensors. You should use GenExtractSliceFromRegion instead.
  2. Partial Update Bug: By calling SetVarValue directly with the result of the interleave operation, you are overwriting the entire destination buffer with a tensor that only represents the subregion (if dst_range is not the full buffer). This will cause data loss for the rest of the buffer. You must use ReshapeCastAndInsertSlice to correctly update the destination tensor.
  3. Source Types: Note that the srcs vector (populated in the context above) also contains memrefs from GenSubviewFromRegion. These should ideally be converted to tensors as well for compatibility with hfusion ops.
  mlir::Value dst_tensor = GenExtractSliceFromRegion(npuirop.dst, npuirop.dst_range);

  auto interleaveOp = builder.create<mlir::hfusion::InterleaveOp>(
    builder.getUnknownLoc(),
    dst_tensor.getType(),
    srcs_vr);
  mlir::Value result = ReshapeCastAndInsertSlice(interleaveOp->getResult(0), GetVarValue(npuirop.dst), npuirop.dst_range);
  SetVarValue(npuirop.dst, result);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are important to address.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread src/target/codegen_npuir_dev.cc Outdated
Comment on lines +2602 to +2613
Value dst = GenSubviewFromRegion(npuirop.dsts[i], npuirop.dsts_range[i]);
dsts.push_back(dst);
}
mlir::ValueRange dsts_vr(dsts);
auto channel_nums = mlir::IntegerAttr::get(
builder.getI64Type(), static_cast<int64_t>(npuirop.channel_nums));
mlir::hivm::DeinterleaveModeAttr index_mode =
mlir::hivm::DeinterleaveModeAttr::get(
&context, NPUIR_STR_DEINTERLEAVEMODE[npuirop.index_mode]);
builder.create<mlir::hivm::VDeinterleaveOp>(builder.getUnknownLoc(),
TypeRange{}, src, dsts_vr,
channel_nums, index_mode);

int64_t current_channel_idx = i;
auto channelIdxAttr = builder.getI64IntegerAttr(current_channel_idx);

auto deinterleaveOp = builder.create<hfusion::DeinterleaveOp>(
builder.getUnknownLoc(),
dst.getType(),
src,
channelIdxAttr
);
SetVarValue(npuirop.dsts[i], deinterleaveOp->getResult(0));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The VdeinterleaveCodegen implementation contains critical issues similar to VinterleaveCodegen:

  1. Data Loss: SetVarValue overwrites the full destination buffer with a subregion-sized tensor. Use ReshapeCastAndInsertSlice to perform a proper tensor.insert_slice into the destination.
  2. Type Mismatch: Use GenExtractSliceFromRegion to obtain a tensor instead of a memref for the destination type.
  3. Missing Attribute: The DeinterleaveOp appears to be missing the total number of channels (npuirop.channel_nums). Without this, the operation cannot determine the correct stride for deinterleaving.
  4. Namespace Consistency: Use mlir::hfusion::DeinterleaveOp to match the style used in other parts of the file.
    mlir::Value dst_tensor = GenExtractSliceFromRegion(npuirop.dsts[i], npuirop.dsts_range[i]);

    int64_t current_channel_idx = i;
    auto channelIdxAttr = builder.getI64IntegerAttr(current_channel_idx);
    auto channelNumsAttr = builder.getI64IntegerAttr(npuirop.channel_nums);

    auto deinterleaveOp = builder.create<mlir::hfusion::DeinterleaveOp>(
        builder.getUnknownLoc(),
        dst_tensor.getType(),
        src,
        channelIdxAttr,
        channelNumsAttr
    );
    mlir::Value result = ReshapeCastAndInsertSlice(deinterleaveOp->getResult(0), GetVarValue(npuirop.dsts[i]), npuirop.dsts_range[i]);
    SetVarValue(npuirop.dsts[i], result);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also these

@baberrehman baberrehman added AscendNPU-IR Want to merge into the npuir branch A5 labels Apr 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A5 AscendNPU-IR Want to merge into the npuir branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants